Nonparametric density estimation and optimal bandwidth selection for protein unfolding and unbinding data.

Authors

  • E Bura
  • A Zhmurov
  • V Barsegov
Abstract

Dynamic force spectroscopy and steered molecular simulations have become powerful tools for analyzing the mechanical properties of proteins, and the strength of protein-protein complexes and aggregates. Probability density functions of the unfolding forces and unfolding times for proteins, and rupture forces and bond lifetimes for protein-protein complexes allow quantification of the forced unfolding and unbinding transitions, and mapping the biomolecular free energy landscape. The inference of the unknown probability distribution functions from the experimental and simulated forced unfolding and unbinding data, as well as the assessment of analytically tractable models of the protein unfolding and unbinding requires the use of a bandwidth. The choice of this quantity is typically subjective as it draws heavily on the investigator's intuition and past experience. We describe several approaches for selecting the "optimal bandwidth" for nonparametric density estimators, such as the traditionally used histogram and the more advanced kernel density estimators. The performance of these methods is tested on unimodal and multimodal skewed, long-tailed distributed data, as typically observed in force spectroscopy experiments and in molecular pulling simulations. The results of these studies can serve as a guideline for selecting the optimal bandwidth to resolve the underlying distributions from the forced unfolding and unbinding data for proteins.
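To make the role of the bandwidth concrete, here is a minimal Python sketch of two standard rules of thumb: the Freedman-Diaconis bin width for histograms and Silverman's bandwidth for a Gaussian kernel density estimator. The abstract does not name the selectors the authors evaluate, so these rules are assumptions chosen for illustration and are not necessarily the paper's methods. The function names and the synthetic gamma-distributed "forces" sample are likewise hypothetical.

```python
import numpy as np

def freedman_diaconis_bin_width(x):
    """Histogram bin width from the Freedman-Diaconis rule: 2 * IQR * n^(-1/3)."""
    x = np.asarray(x, dtype=float)
    q75, q25 = np.percentile(x, [75, 25])
    return 2.0 * (q75 - q25) / np.cbrt(x.size)

def silverman_bandwidth(x):
    """Silverman's rule of thumb: 0.9 * min(std, IQR / 1.34) * n^(-1/5)."""
    x = np.asarray(x, dtype=float)
    q75, q25 = np.percentile(x, [75, 25])
    sigma = min(np.std(x, ddof=1), (q75 - q25) / 1.34)
    return 0.9 * sigma * x.size ** (-0.2)

def gaussian_kde_estimate(x, grid, h):
    """Evaluate a Gaussian kernel density estimate with bandwidth h on grid."""
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    z = (grid[:, None] - x[None, :]) / h          # pairwise scaled distances
    kernel_sum = np.exp(-0.5 * z**2).sum(axis=1)  # sum of Gaussian kernels
    return kernel_sum / (x.size * h * np.sqrt(2.0 * np.pi))

# Synthetic, skewed, long-tailed sample standing in for unfolding forces
# (illustrative only; not data from the paper).
rng = np.random.default_rng(0)
forces = rng.gamma(shape=3.0, scale=50.0, size=500)

bin_width = freedman_diaconis_bin_width(forces)
h = silverman_bandwidth(forces)
grid = np.linspace(forces.min() - 3 * h, forces.max() + 3 * h, 400)
density = gaussian_kde_estimate(forces, grid, h)
```

Both rules are tuned for roughly unimodal data. For multimodal samples they tend to oversmooth, which is one reason data-driven selectors such as cross-validation are often considered instead.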

Related resources

Bandwidth Selection in Kernel Density Estimation: a Review

Although nonparametric kernel density estimation is nowadays a standard technique in explorative data analysis, there is still a big dispute on how to assess the quality of the estimate and which choice of bandwidth is optimal. The main argument is on whether one should use the Integrated Squared Error or the Mean Integrated Squared Error to define the optimal bandwidth. In the last years a lot...

nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference

Nonparametric kernel density and local polynomial regression estimators are very popular in Statistics, Economics, and many other disciplines. They are routinely employed in applied work, either as part of the main empirical analysis or as a preliminary ingredient entering some other estimation or inference procedure. This article describes the main methodological and numerical features of the ...

Optimal Bandwidth Selection for Nonparametric Conditional Distribution and Quantile Functions

Li & Racine (2008) consider the nonparametric estimation of conditional cumulative distribution functions (CDF) in the presence of discrete and continuous covariates along with the associated conditional quantile function. However, they did not propose an optimal data-driven method of bandwidth selection and left this important problem as an ‘open question’. In this paper we propose an automati...

Graphics processing units in acceleration of bandwidth selection for kernel density estimation

The Probability Density Function (PDF) is a key concept in statistics. Constructing the most adequate PDF from the observed data is still an important and interesting scientific problem, especially for large datasets. PDFs are often estimated using nonparametric data-driven methods. One of the most popular nonparametric methods is the Kernel Density Estimator (KDE). However, a very serious drawb...

Optimal bandwidth selection for robust generalized method of moments estimation

A two-step generalized method of moments estimation procedure can be made robust to heteroskedasticity and autocorrelation in the data by using a nonparametric estimator of the optimal weighting matrix. This paper addresses the issue of choosing the corresponding smoothing parameter (or bandwidth) so that the resulting point estimate is optimal in a certain sense. We derive an asymptotically op...

Journal:
  • The Journal of Chemical Physics

Volume 130, Issue 1

Pages: -

Publication date: 2009